The networks of syllables and characters in Chinese
Abstract
We construct networks from the syllables (both base syllables and tonal syllables) and the characters of Chinese. The nodes (vertices) represent syllables in the syllable networks and characters in the character network. A link (edge) is established between any two syllables (or two characters) that occur together in one or more words. We use two dictionaries for the analysis: a Putonghua dictionary and a Cantonese dictionary. All of the networks show short distances and high clustering coefficients compared with Erdős-Rényi (ER) random networks. The degree distributions all follow a power law; however, the exponents for the base syllable, tonal syllable and Chinese character networks differ considerably. These differences may reflect the different cognitive processes involved in constructing new Chinese words. The networks are compared with the syllabic networks of Portuguese in terms of the magnitude of the power-law exponent. The Chinese character network is found to be the most similar to the Portuguese syllabic network (γ ≈ 1.4).
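The construction rule stated in the abstract (two units are linked when they occur in the same word) can be illustrated with a minimal sketch. The word list below is a hypothetical toy sample, not data from the Putonghua or Cantonese dictionaries, and the function names are illustrative assumptions rather than the authors' code. The sketch builds the character network and computes node degrees and the average clustering coefficient.

```python
from collections import defaultdict
from itertools import combinations


def build_network(words):
    """Link every pair of distinct units that occur together in a word.

    Each word is a sequence of units: a string of characters, or a tuple
    of syllables for a syllable network.
    """
    adj = defaultdict(set)
    for word in words:
        units = set(word)
        for u in units:
            adj[u]  # touch the key so isolated units still appear as nodes
        for a, b in combinations(units, 2):
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy vocabulary (assumption: for illustration only).
words = ["中国", "中文", "国家", "文学", "学生", "中学生"]
net = build_network(words)
degrees = {node: len(nbrs) for node, nbrs in net.items()}
print(degrees)
print(average_clustering(net))
```

For a syllable network, the same function accepts words given as tuples of syllables, for example `("zhong1", "guo2")` for tonal syllables or `("zhong", "guo")` for base syllables. Estimating a power-law exponent from the degree distribution would need a real dictionary and a proper fitting procedure, which this sketch does not attempt.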
Similar resources
The Preliminary Results of a Mandarin Dictation Machine Based Upon Chinese Natural Language Analysis
This paper describes the preliminary results of the first research effort toward a Mandarin dictation machine in the world for the input of Chinese characters to computers. Considering the special characteristics of Chinese language, syllables are chosen as the basic units for dictation. The machine is divided into two subsystems. The first is to recognize the syllables using speech signal proc...
Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary
This paper describes the first successfully implemented real-time Mandarin dictation machine developed in the world which recognizes Mandarin speech with very large vocabulary and almost unlimited texts for the input of Chinese characters into computers. Considering the special characteristics of the Chinese language, syllables are chosen as the basic units for dictation. The machine is ...
Keyboards for inputting Japanese language-arxiv
The most commonly used Japanese alphabets are Kanji, Hiragana and Katakana. The Kanji alphabet includes pictographs or ideographic characters that were adopted from the Chinese alphabet. Hiragana and Katakana are phonetic alphabets that do not include any characters common to each other or to Kanji. Hiragana is used to spell words of Japanese origin, while Katakana is used to spell words of wes...
Neural Cognitions of Perceiving Chinese Characters: Phonological versus Logographical Effects
This study examines phonological versus logographical effects in neural cognition when perceiving Chinese characters. The paradigm uses a series of figural animals, ancient pictographic Chinese characters, modern Chinese characters, and Mandarin-based syllables. Participants are grouped by two levels of word recognition skill, the mastered group and the learner group. The current study us...
Human Factors And Linguistic Considerations: Keys To High-Speed Chinese Character Input
With a keyboard and supporting system developed at Cornell University, input methods used to identify ideographs are adaptations of well-known schemes; innovation is in the addition of automatic machine selection of ambiguously identified characters. The unique feature of the Cornell design is that a certain amount of intelligence has been built into the machine. This allows an operator to take ...
Journal title:
- Journal of Quantitative Linguistics
Volume 15, issue
Pages -
Publication date 2008